智能论文笔记

GPT-3-driven pedagogical agents for training children's curious question-asking skills

Rania Abdelghani , Yen-Hsiang Wang , Xingdi Yuan , Tong Wang , Hélène Sauzéon , Pierre-Yves Oudeyer

分类：自然语言处理

2022-11-25

Students' ability to ask curious questions is a crucial skill that improves their learning processes. To train this skill, previous research has used a conversational agent that propose specific cues to prompt children's curiosity during learning. Despite showing pedagogical efficiency, this method is still limited since it relies on generating the said prompts by hand for each educational resource, which can be a very long and costly process. In this context, we leverage the advances in the natural language processing field and explore using a large language model (GPT-3) to automate the generation of this agent's curiosity-prompting cues to help children ask more and deeper questions. We then used this study to investigate a different curiosity-prompting behavior for the agent. The study was conducted with 75 students aged between 9 and 10. They either interacted with a hand-crafted conversational agent that proposes "closed" manually-extracted cues leading to predefined questions, a GPT-3-driven one that proposes the same type of cues, or a GPT-3-driven one that proposes "open" cues that can lead to several possible questions. Results showed a similar question-asking performance between children who had the two "closed" agents, but a significantly better one for participants with the "open" agent. Our first results suggest the validity of using GPT-3 to facilitate the implementation of curiosity-stimulating learning technologies. In a second step, we also show that GPT-3 can be efficient in proposing the relevant open cues that leave children with more autonomy to express their curiosity.

translated by 谷歌翻译

Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation

Xingdi Yuan , Tong Wang , Yen-Hsiang Wang , Emery Fine , Rania Abdelghani , Pauline Lucas , Hélène Sauzéon , Pierre-Yves Oudeyer

分类：自然语言处理

2022-09-22

近年来，大型语言模型（LLMS）在自然语言产生中表现出了令人印象深刻的实力。提高发电多样性的一种常见做法是从模型中采样多个输出。但是，缺乏一种简单且可靠的方式来从这些随机样品中选择最佳输出。作为一个案例研究，在问题产生的背景下，我们提出了两种基于迅速的方法，以从一组LLM生成的候选人中选择高质量问题。我们的方法在1）限制下起作用，一个黑框（不可修改）问题生成模型和2）缺乏访问人类宣传的参考文献 - 这两者都是现实世界中LLMS的现实局限性。通过自动和人类评估，我们从经验上证明，我们的方法可以有效地选择比贪婪的生成更高质量的问题。

translated by 谷歌翻译

General-to-Specific Transfer Labeling for Domain Adaptable Keyphrase Generation

Rui Meng , Tong Wang , Xingdi Yuan , Yingbo Zhou , Daqing He

分类：自然语言处理

2022-08-20

训练键形生成（KPG）模型需要大量注释的数据，这些数据可能非常昂贵，并且通常仅限于特定域。在这项研究中，我们首先证明了不同领域之间的巨大分布变化极大地阻碍了KPG模型的可传递性。然后，我们提出了一条三阶段的管道，该管道逐渐以数据效率的方式指导KPG模型从一般句法特征到与域相关的语义的学习重点。借助域将军短语预训练，我们使用通用短语注释进行预训练序列到序列模型，这些模型在网络上广泛使用，这使模型能够在广泛的域中生成短语。然后将所得模型应用于传输标签阶段，以产生域特异性伪键形，这有助于将模型适应新域。最后，我们使用有限的数据将模型微调，以完全适应目标域。我们的实验结果表明，所提出的过程可以在新领域中产生高质量的钥匙串，并在适应有限的域注释数据后进行一致的改进。

translated by 谷歌翻译

Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language

Iou-Jen Liu , Xingdi Yuan , Marc-Alexandre Côté , Pierre-Yves Oudeyer , Alexander G. Schwing

分类：人工智能 | 自然语言处理

2022-05-12

为了解决艰巨的任务，人类提出问题以从外部来源获取知识。相反，经典的加强学习者缺乏这种能力，并且常常诉诸探索性行为。这会加剧，因为很少的当今环境支持查询知识。为了研究如何通过语言教授代理来查询外部知识，我们首先介绍了两个新环境：基于网格世界的Q-babyai和基于文本的Q-Textworld。除了物理互动外，代理还可以查询专门针对这些环境的外部知识源来收集信息。其次，我们提出了“寻求知识”（AFK）代理，该代理学会生成语言命令以查询有助于解决任务的有意义的知识。 AFK利用非参数记忆，指针机制和情节探索奖金来解决（1）无关的信息，（2）一个较大的查询语言空间，（3）延迟奖励有意义的查询。广泛的实验表明，AFK代理在具有挑战性的Q-Babyai和Q-Textworld环境方面优于最近的基线。

translated by 谷歌翻译

RELIANT: Fair Knowledge Distillation for Graph Neural Networks

Yushun Dong , Binchi Zhang , Yiling Yuan , Na Zou , Qi Wang , Jundong Li

分类：机器学习

2023-01-03

Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs are with a large number of parameters, which makes these GNNs computationally expensive. Therefore, it is difficult to deploy them onto edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution to compress GNNs, where a light-weighted model (i.e., the student model) is encouraged to mimic the behavior of a computationally expensive GNN (i.e., the teacher GNN model). Nevertheless, most existing GNN-based KD methods lack fairness consideration. As a consequence, the student model usually inherits and even exaggerates the bias from the teacher GNN. To handle such a problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. Then we propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and thus can be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, which corroborates that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.

translated by 谷歌翻译

High-Quality Supersampling via Mask-reinforced Deep Learning for Real-time Rendering

Hongliang Yuan , Boyu Zhang , Mingyan Zhu , Ligang Liu , Jue Wang

分类：计算机视觉

2023-01-03

To generate high quality rendering images for real time applications, it is often to trace only a few samples-per-pixel (spp) at a lower resolution and then supersample to the high resolution. Based on the observation that the rendered pixels at a low resolution are typically highly aliased, we present a novel method for neural supersampling based on ray tracing 1/4-spp samples at the high resolution. Our key insight is that the ray-traced samples at the target resolution are accurate and reliable, which makes the supersampling an interpolation problem. We present a mask-reinforced neural network to reconstruct and interpolate high-quality image sequences. First, a novel temporal accumulation network is introduced to compute the correlation between current and previous features to significantly improve their temporal stability. Then a reconstruct network based on a multi-scale U-Net with skip connections is adopted for reconstruction and generation of the desired high-resolution image. Experimental results and comparisons have shown that our proposed method can generate higher quality results of supersampling, without increasing the total number of ray-tracing samples, over current state-of-the-art methods.

translated by 谷歌翻译

PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation

Xiangtai Li , Shilin Xu , Yibo Yang , Haobo Yuan , Guangliang Cheng , Yunhai Tong , Zhouchen Lin , Dacheng Tao

分类：计算机视觉

2023-01-03

Panoptic Part Segmentation (PPS) unifies panoptic segmentation and part segmentation into one task. Previous works utilize separated approaches to handle thing, stuff, and part predictions without shared computation and task association. We aim to unify these tasks at the architectural level, designing the first end-to-end unified framework named Panoptic-PartFormer. Moreover, we find the previous metric PartPQ biases to PQ. To handle both issues, we make the following contributions: Firstly, we design a meta-architecture that decouples part feature and things/stuff feature, respectively. We model things, stuff, and parts as object queries and directly learn to optimize all three forms of prediction as a unified mask prediction and classification problem. We term our model as Panoptic-PartFormer. Secondly, we propose a new metric Part-Whole Quality (PWQ) to better measure such task from both pixel-region and part-whole perspectives. It can also decouple the error for part segmentation and panoptic segmentation. Thirdly, inspired by Mask2Former, based on our meta-architecture, we propose Panoptic-PartFormer++ and design a new part-whole cross attention scheme to further boost part segmentation qualities. We design a new part-whole interaction method using masked cross attention. Finally, the extensive ablation studies and analysis demonstrate the effectiveness of both Panoptic-PartFormer and Panoptic-PartFormer++. Compared with previous Panoptic-PartFormer, our Panoptic-PartFormer++ achieves 2% PartPQ and 3% PWQ improvements on the Cityscapes PPS dataset and 5% PartPQ on the Pascal Context PPS dataset. On both datasets, Panoptic-PartFormer++ achieves new state-of-the-art results with a significant cost drop of 70% on GFlops and 50% on parameters. Our models can serve as a strong baseline and aid future research in PPS. Code will be available.

translated by 谷歌翻译

CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection

Jie Liu , Yixiao Zhang , Jie-Neng Chen , Junfei Xiao , Yongyi Lu , Bennett A. Landman , Yixuan Yuan , Alan Yuille , Yucheng Tang , Zongwei Zhou

分类：计算机视觉 | 机器学习

2023-01-02

An increasing number of public datasets have shown a marked clinical impact on assessing anatomical structures. However, each of the datasets is small, partially labeled, and rarely investigates severe tumor subjects. Moreover, current models are limited to segmenting specific organs/tumors, which can not be extended to novel domains and classes. To tackle these limitations, we introduce embedding learned from Contrastive Language-Image Pre-training (CLIP) to segmentation models, dubbed the CLIP-Driven Universal Model. The Universal Model can better segment 25 organs and 6 types of tumors by exploiting the semantic relationship between abdominal structures. The model is developed from an assembly of 14 datasets with 3,410 CT scans and evaluated on 6,162 external CT scans from 3 datasets. We rank first on the public leaderboard of the Medical Segmentation Decathlon (MSD) and achieve the state-of-the-art results on Beyond The Cranial Vault (BTCV). Compared with dataset-specific models, the Universal Model is computationally more efficient (6x faster), generalizes better to CT scans from varying sites, and shows stronger transfer learning performance on novel tasks. The design of CLIP embedding enables the Universal Model to be easily extended to new classes without catastrophically forgetting the previously learned classes.

translated by 谷歌翻译

A Concept Knowledge Graph for User Next Intent Prediction at Alipay

Yacheng He , Qianghuai Jia , Lin Yuan , Ruopeng Li , Yixin Ou , Ningyu Zhang

分类：自然语言处理 | 人工智能 | 机器学习

2023-01-02

This paper illustrates the technologies of user next intent prediction with a concept knowledge graph. The system has been deployed on the Web at Alipay, serving more than 100 million daily active users. Specifically, we propose AlipayKG to explicitly characterize user intent, which is an offline concept knowledge graph in the Life-Service domain modeling the historical behaviors of users, the rich content interacted by users and the relations between them. We further introduce a Transformer-based model which integrates expert rules from the knowledge graph to infer the online user's next intent. Experimental results demonstrate that the proposed system can effectively enhance the performance of the downstream tasks while retaining explainability.

translated by 谷歌翻译

EvidenceCap: Towards trustworthy medical image segmentation via evidential identity cap

Ke Zou , Xuedong Yuan , Xiaojing Shen , Yidi Chen , Meng Wang , Rick Siow Mong Goh , Yong Liu , Huazhu Fu

分类：计算机视觉

2023-01-01

Medical image segmentation (MIS) is essential for supporting disease diagnosis and treatment effect assessment. Despite considerable advances in artificial intelligence (AI) for MIS, clinicians remain skeptical of its utility, maintaining low confidence in such black box systems, with this problem being exacerbated by low generalization for out-of-distribution (OOD) data. To move towards effective clinical utilization, we propose a foundation model named EvidenceCap, which makes the box transparent in a quantifiable way by uncertainty estimation. EvidenceCap not only makes AI visible in regions of uncertainty and OOD data, but also enhances the reliability, robustness, and computational efficiency of MIS. Uncertainty is modeled explicitly through subjective logic theory to gather strong evidence from features. We show the effectiveness of EvidenceCap in three segmentation datasets and apply it to the clinic. Our work sheds light on clinical safe applications and explainable AI, and can contribute towards trustworthiness in the medical domain.

translated by 谷歌翻译